ABSTRACT

Imprinted retrotransposed genes share a common 
genomic organization including a promoter- 
associated differentially methylated region (DMR) 
and a position within the intron of a multi-exonic 
'host' gene. In the mouse, at least one transcript 
of the host gene is also subject to genomic imprint- 
ing. Human retrogene orthologues are imprinted 
and we reveal that human host genes are not 
imprinted. This coincides with genomic rearrange- 
ments that occurred during primate evolution, which 
increase the separation between the retrogene 
DMRs and the host genes. To address the mech- 
anisms governing imprinted retrogene expression, 
histone modifications were assayed at the DMRs. 
For the mouse retrogenes, the active mark 
H3K4me2 was associated with the unmethylated 
paternal allele, while the methylated maternal allele 
was enriched in repressive marks including 
H3K9me3 and H4K20me3. Two human retrogenes 
showed monoallelic enrichment of active, but not 
of repressive marks suggesting a partial uncoupling 
of the relationship between DNA methylation and 
repressive histone methylation, possibly due to the 
smaller size and lower CpG density of these DMRs. 
Finally, we show that the genes immediately
flanking the host genes in mouse and human are
biallelically expressed in a range of tissues, sug- 
gesting that these loci are distinct from large 
imprinted clusters. 

INTRODUCTION 

Genomic imprinting is a form of epigenetic gene regula- 
tion that results in allelic expression dictated by parental 
origin (1). Differential DNA methylation is a major com- 
ponent in regulating this process. Discrete differentially 
methylated cis-acting regions, known as imprinting
control regions (ICRs), orchestrate the monoallelic expres- 
sion of numerous genes within imprinted domains and are 
established while the maternal and paternal genomes are 
physically separated in their respective germ lines. To 
date, all imprinted domains are known to contain 
regions of differential DNA methylation (DMRs) that 
are deposited in CpG-rich sequences during oogenesis or 
spermatogenesis by the DNMT3A/DNMT3L de novo 
methyltransferase complex (2-4). A subset of maternally 
DNA-methylated germline DMRs require the activity 
of the amine oxidase domain 1 containing histone 
demethylase AOF1/KDM1. This demethylase is presum- 
ably needed to remove any permissive histone H3 lysine 4 
(H3K4) methylation present at these CpG islands in the 
growing oocytes (5). After fertilization, these regions of 
differential DNA methylation are maintained in somatic 
tissues by DNMT1 (6), and are associated with numerous
histone modifications. 

It has previously been shown that DMRs have a con- 
stitutional histone signature that comprises histone H3 
lysine 9 trimethylation (H3K9me3), histone H4 lysine
20 trimethylation (H4K20me3) and symmetrical histone 
H2A/H4 arginine 3 dimethylation (H2A/H4R3me2s) 
on the DNA methylated allele (7,8). This is in contrast 
to the enrichment of the transcriptionally permissive 
H3K4me2/3 mark on the unmethylated allele (9). A
number of genes that are imprinted solely in the mouse
placenta have recently been shown to require allelic
repressive histone modifications at their own DNA-unmethylated
promoters to maintain allelic expression
(10-13).

Imprinted genes have diverse evolutionary origins. 
Some imprinted genes are products of retrotransposition 
from parental genes on the X chromosome. Four imprinted
retrogenes in the mouse, Mcts2, Nap1l5, U2af1-rs1
(also called Zrsr1) and Inpp5f_v2, are associated with
DMRs at their promoters, and reside in introns of multi- 
exonic host genes. In all cases, at least one transcript of the 
host is also subject to imprinting. We have previously 
shown that the mouse retrogene Mcts2 influences the 
choice of polyadenylation (polyA) site for transcripts of 
the host gene H13 in an allele-specific manner (14).
Expression of Mcts2 from the paternal allele causes H13 
transcripts to terminate upstream of the retrogene. On the 
maternal chromosome, H13 utilizes downstream polyA 
sites because the Mcts2 DMR is methylated and the 
retrogene silenced. A recent transcriptome-wide analysis, 
using the ultra-sensitive RNA-seq technology, which is capable
of detecting subtle biases in allelic transcription, has
suggested that one transcript variant of Inpp5f, the host 
gene of Inpp5f_v2, and Herc3, the Nap1l5 host gene, are
also subject to isoform-specific allelic expression (15). 

To investigate whether the human orthologues of the 
X-derived imprinted retrogenes influence allele-specific ex- 
pression of their respective host genes, and whether this 
influence extends to neighboring genes, we have analyzed 
the allelic expression of retrogenes, host and flanking 
genes in humans. The U2af1-rs1 gene does not have a
human counterpart. We find that the human orthologues 
INPP5F_V2, NAP1L5 and MCTS2 are paternally ex- 
pressed in a wide range of fetal tissues, and that their 
promoters are embedded in maternally DNA-methylated 
regions. In humans, the host genes are not subject to imprinting,
probably due to differing exon/UTR positions
in relation to the retrogene integration sites. The genes 
immediately flanking the host genes are biallelically ex- 
pressed in both mice and humans, showing retrogene-host 
pairs do not form parts of larger imprinted clusters. In 
mice, the allelic chromatin of these DMRs conforms to 
the constitutional histone modification signature with 
the repressive modifications H3K9me3, H4K20me3 and 
H2A/H4R3me2s enriched on the DNA methylated 
allele, and the permissive modification H3K4me2 
enriched on the unmethylated allele. These patterns of 
histone modifications are not conserved at the human 
NAP1L5 and MCTS2 promoters, correlating with 
reduced CpG content and CpG island size, which we
speculate may influence the recruitment of the histone
methyltransferases (HMTs). 

MATERIALS AND METHODS 

Human tissues 

A cohort comprising 65 fetal tissue sets (8-18 weeks) with
corresponding maternal blood samples and 96 term placental
samples was obtained from the Moore Tissue Bank and is
described elsewhere (16). An additional 96 human
placenta samples were obtained from the Hospital 
St Joan De Deu collection (Barcelona, Spain). Normal 
peripheral blood was collected from adult volunteers 
aged between 19 and 60 years. DNA and RNA extraction
and cDNA synthesis were carried out as previously 
described (11). Ethical approval for adult blood and 
fetal tissue collection was granted by the Hammersmith, 
Queen Charlotte's and Chelsea and Acton Hospital 
Research Ethics Committee (Project Registration 2001/ 
6029 and 2001/6028); collection of the HSJD placental
cohort was approved by the Hospital
St Joan De Deu Ethics Committee (Study number 35/07).

Cell lines and mouse crosses 

Wild-type mouse embryos and placentas were produced 
by crossing C57BL/6 females with either Mus musculus 
molossinus (JF1) or Mus musculus castaneus (C) male
mice. RNA and DNA from Dnmt3l-/+ mice (BxC,
C57BL/6 mother and Castaneus father) were isolated and
extracted as previously described (2). The E9.5 Dnmt3l-/+
embryos (BxJ) used to assess Nap1l5 expression were a
kind gift from Dr Kenichiro Hata (NRICHD, Okura, 
Tokyo, Japan). The human TCL1 and 2 placental tropho- 
blast cell lines were grown in DMEM supplemented with 
10% FCS and antibiotics. 

Allelic expression analysis 

Genotypes of DNA were obtained for exonic SNPs 
identified in the UCSC browser (NCBI36/hg18,
Assembly 2006) by PCR. Sequences were interrogated 
using Sequencher v4.6 (Gene Codes Corporation, MI, 
USA) to distinguish informative heterozygote samples. 
Informative samples were analysed by RT-PCR in corres- 
ponding cDNA using, where possible, intron-crossing 
primers that incorporated the heterozygous SNP in the 
resulting amplicon (Supplementary Table S1). RT-PCRs
were performed using cycle numbers determined to be 
within the exponential phase of the PCR, which varied
for each gene but was between 32 and 40 cycles. The
RT-PCRs for HERC3A, HERC3C and both isoforms of
Abcg2 were performed as nested reactions: the first
PCR was amplified for 25 cycles, and 5 µl of this product
was used as template for the second-round PCR, which was
limited to 30 cycles.

Real-time qRT-PCR 

All PCRs were run in triplicate from the same sample on 
either an ABI Prism 7700 sequence detector or a 7900 Fast 
real-time PCR machine (Applied Biosystems) following 
the manufacturer's protocol. All primers were optimized
using SYBR Green amplification followed by melt curve 
analysis to ensure that amplicons were free of primer 
dimer products. Thermal cycling parameters included 
Taq polymerase activation at 95°C for 10 min for one
cycle, repetitive denaturation at 95°C for 15 s, and annealing
at 60°C for 1 min for 40 cycles. All resulting triplicate
cycle threshold (Ct) values had to be within one Ct of each
other. The quantitative values for each triplicate were 
determined as a ratio with the level of Gapdh, measured 
in the same sample, with the mean providing relative 
expression values. 
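
The normalization described above can be sketched as follows. This is a minimal illustration rather than the analysis script used in this study: it assumes perfect amplification efficiency (a doubling of product per cycle, so that quantity is proportional to 2^-Ct) and pairs target and Gapdh replicates by position. The function name and the example Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_ct, gapdh_ct, max_spread=1.0):
    """Mean target/Gapdh ratio from replicate Ct values of one sample.

    Assumption (not stated in the text): 100% PCR efficiency, so the
    ratio of target to Gapdh in one replicate is 2 ** (Ct_gapdh - Ct_target).
    """
    # Quality filter from the text: replicate Ct values must lie
    # within one Ct of each other.
    for name, cts in (("target", target_ct), ("Gapdh", gapdh_ct)):
        if max(cts) - min(cts) > max_spread:
            raise ValueError(f"{name} replicates differ by more than {max_spread} Ct")

    # One ratio per replicate, then the mean gives the relative expression.
    ratios = [2.0 ** (g - t) for t, g in zip(target_ct, gapdh_ct)]
    return mean(ratios)


if __name__ == "__main__":
    # Hypothetical Ct values for illustration only.
    print(relative_expression([24.1, 24.3, 24.0], [19.9, 20.0, 20.2]))
```

If primer efficiencies were measured, the base 2.0 would be replaced by the measured amplification factor for each primer pair.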

Analysis of allelic DNA methylation 

Approximately 1 µg DNA was subjected to sodium
bisulphite treatment and purified using the EZ GOLD 
methylation kit (ZYMO, Orange, CA, USA). Bisulphite 
specific primers for each region were used with Hotstar 
Taq polymerase (Qiagen, West Sussex, UK) at 45 cycles 
and the resulting PCR product cloned into pGEM-T Easy 
vector (Promega) for subsequent sequencing. MeDIP was 
performed on 6 µg sonicated (Diagenode Bioruptor)
genomic DNA with an average size of 150 bp. Samples 
were denatured and incubated with a monoclonal 
antibody against 5-methylcytidine (Eurogentec). 
Immunoprecipitated DNA was then isolated using IgG
Dynabeads (Dynal Biotech), digested with proteinase K,
extracted with phenol-chloroform and recovered by
ethanol precipitation. MeDIP enrichment was verified by
duplexed PCR for the methylated SERPIN B5 promoter 
and the unmethylated UBE2B promoter. Southern 
blotting was performed following standard protocols 
using methylation-sensitive restriction enzymes. Digested 
DNA was subjected to agarose gel electrophoresis and 
transferred to Hybond N+ membrane (Amersham). 
Radio-labeled PCR product probes were hybridized
overnight at 65°C, and subsequently washed in increasing
stringency SSC/0.1% SDS washes. PCR primers used
to generate probes are listed in Supplementary Table S1.

Chromatin immunoprecipitation 

Two adult leukocyte samples and the TCL1 and TCL2 cell 
lines were used in addition to E18.5 mouse embryos
for chromatin immunoprecipitation (ChIP). ChIP was 
carried out as previously described (11,13) using
antisera directed against
H3K4me2 (07-030), H3K9me2 (07-441), H3K9me3
(060904589), H3K9ac (07-352), H3K27me3 (07-449) and
H4K20me3 (07-463) (all Upstate Biotechnology), and H2A/H4R3me2s
(Abcam ab5823 97454/520317). ChIPed DNA
was subjected to allele-specific PCR. Polymorphisms
within 1 kb of the CpG island were identified by 
interrogating SNP databases or genomic sequencing (see 
Supplementary Table S1 for primer sequences and
location). Only ChIP sample sets that showed enrichment 
for additional ICRs were used in the analysis. 

Precipitation levels in the ChIP samples were 
determined by real-time PCR amplification, using SYBR 
Green PCR kit (Applied Biosystems). Each PCR was run
in triplicate and results are presented as fold enrichment
(comparison to mock) and normalized to the level of precipitation
at the SNURF-ICR, a control for both active
and repressive histone modifications located on human 
chromosome 15. 
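
As an illustration of this calculation, the sketch below computes fold enrichment over the mock precipitation and then expresses it relative to the SNURF-ICR control. It is not the authors' script: the 2^(Ct_mock - Ct_IP) form assumes perfect amplification efficiency, and the function names and Ct values are hypothetical.

```python
from statistics import mean


def fold_enrichment(ip_cts, mock_cts):
    """Fold enrichment of an antibody-bound fraction over the mock.

    Assumption: each PCR cycle doubles the product, so a difference of
    one Ct corresponds to a two-fold difference in template.
    """
    return 2.0 ** (mean(mock_cts) - mean(ip_cts))


def normalized_enrichment(ip_cts, mock_cts, ctrl_ip_cts, ctrl_mock_cts):
    """Fold enrichment at a test region relative to the control region
    (here, the SNURF-ICR) measured in the same ChIP sample."""
    test = fold_enrichment(ip_cts, mock_cts)
    control = fold_enrichment(ctrl_ip_cts, ctrl_mock_cts)
    return test / control


if __name__ == "__main__":
    # Hypothetical triplicate Ct values for illustration only.
    value = normalized_enrichment(
        ip_cts=[25.0, 25.1, 24.9], mock_cts=[28.0, 28.1, 27.9],
        ctrl_ip_cts=[24.0, 24.0, 24.0], ctrl_mock_cts=[28.0, 28.0, 28.0],
    )
    print(value)
```

A value below 1 indicates that the mark is precipitated less efficiently at the test region than at the control region in the same sample.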

RESULTS 

MCTS2 does not influence allelic expression of HM13 

The H13 gene on mouse chromosome 2 is known to 
generate at least five transcripts, all originating from a 
single promoter, but differing in polyA site usage (14). 
The utilization of these alternative polyA sites is 
influenced by the paternally expressed Mcts2 imprinted 
retrogene and the CpG island that comprises its DMR. 
The short H13d and e transcripts are paternally expressed,
whereas the H13a, b and c transcripts, that extend through
Mcts2 and the DMR to the canonical polyA site are ma- 
ternally expressed (Figure 1A). To assess allelic expression 
in humans we identified transcribed SNPs unique to each 
isoform, and allele-specific assays were carried out in a
selection of human first trimester fetal tissues and term 
placentas. The human MCTS2 gene (also known as 
PSIMCT-1, MCTS1-pseudogene) is imprinted in a
variety of fetal tissues (Figure 1B and Supplementary
Table S2). MCTS2 differs from its mouse orthologue as 
it can splice into the last eight exons of the HM13 host 
gene (Figure 1B). Sequence analyses revealed that in
addition to the highly conserved MCTS2 open reading 
frame, this RNA has the potential to be bi-cistronic, 
encoding for a chimeric protein lacking the first 151 
amino acids of HM13, but sharing 243 amino acids in 
the C-terminus. 

The MCTS2 promoter is embedded within a DMR 
(Figure 1B and Supplementary Figure S1A), while the
HM13 host gene originates from a CpG island that is 
unmethylated in all the tissues analysed. Alignment of 
human expressed sequence tags (ESTs) revealed a 
number of transcripts. The expression of the human 
short HM13D isoform (Genbank NM_178982), which terminates
prior to the transcriptional start site (TSS) of
MCTS2, and the full-length transcript HM13C 
(Genbank NM_030789) are biallelic in all fetal tissues 
analysed (Figure 1B and Supplementary Table S2).

The NAP1L5 promoter is not within a CpG island, 
but is a DMR 

The paternally expressed Nap1l5 retrogene was first
identified in a genome-wide screen for differential DNA
methylation, and was reported to be predominantly pater- 
nally expressed in mouse brain (17). The host gene, Herc3, 
gives rise to a number of transcript isoforms. At least two 
short isoforms (Herc3b and c) are expressed from the 
paternal allele in mouse brain [A.J. Wood and R.J. 
Oakey, unpublished data, (15)]. The full length Herc3a 
transcript has a maternal expression bias, similar to the 
full length H13 isoforms (14). We detect paternal expression
of NAP1L5 in all fetal tissues analyzed. The human
HERC3 gene also contains one long (HERC3A, Genbank 
NM_014606) and two short isoforms (HERC3B,
Genbank BC038960; HERC3C, Genbank AK296397). 
Figure 1. (A) Map of the H13 locus located on mouse chromosome 2, showing the location of the various imprinted transcripts and CpG islands
(red transcripts are maternally expressed, blue are paternally expressed and grey are expressed from both parental alleles. Arrows represent direction 
of transcription). (B) Schematic of the human HM13 gene on chromosome 20, showing the distribution of exons and insertion of MCTS2 into intron 
4. The methylation status of the HM13 promoter CpG island and MCTS2 CpG island were examined by bisulphite PCR. Each circle represents a 
single CpG dinucleotide and the strand. Filled circle, a methylated cytosine; open circle, unmethylated cytosine. The sequence traces show allelic 
expression for MCTS2 and HM13 isoforms (for clarity only sequence traces for MCTS2 BC053868 are shown). (C) A map of the Herc3 domain on 
mouse chromosome 6. (D) The human NAP1L5 gene and the insertion into intron 22 of HERC3. (E) A schematic map of the Inpp5f gene on mouse 
chromosome 7, and (F) the orthologous region on human chromosome 10. 
Our allele-specific assays are suggestive of biallelic expression
of all isoforms in all tissues including brain (Figure
1C and D, and Supplementary Table S2), although our 
methodology is not directly comparable with RNA-seq, 
which can detect subtle expression biases. The HERC3 
transcripts initiate from an unmethylated CpG island, 
whereas the NAP1L5 promoter is DNA methylated on 
the maternal allele, even though the region is not statistically
a CpG island in humans (Figure 1D).

INPP5F_V2 is imprinted in numerous human tissues 

We previously identified a neural-specific, paternally ex- 
pressed Inpp5f_v2 transcript using expression microarrays 
(18). Like the other murine imprinted retrogene loci, the 
host gene Inpp5f exhibits isoform-specific expression, with 
a maternal expression bias in a truncated shorter tran- 
script (Genbank AK039468) (15). In addition, another 
transcript, Inpp5f_v3, arising from a different promoter,
is paternally expressed in mouse brain (19,20). The 
genomic organization of the human locus resembles that 
of the mouse; however, there is no evidence for a human
Inpp5f_v3 orthologue (Figure 1E and F). The promoter of
the human INPP5F_V2 transcript is embedded within a
maternally DNA-methylated DMR (Figure 1F and
Supplementary Figure S1B), resulting in paternal expression
in a wide range of tissues (Supplementary Table S2).



The full-length host gene transcripts (Genbank 
AB023183), and the truncated isoform (Genbank 
BC052367), originate from an unmethylated CpG island 
and are biallelically expressed. 

Retrogene-host pairs do not form part of larger
imprinting clusters 

In order to determine the boundaries of imprinting at each 
retrogene-host locus, we investigated the allele-specific expression
of the genes flanking the host genes in both mice
and humans. We assessed expression in various embryonic
tissues and placenta using allele-specific assays in
crosses of mouse strains C57BL/6 (B) x Mus musculus
castaneus (C). The genes immediately adjacent to Mcts2/H13,
Id1 and Rem1, are biallelically expressed in all tissues
at embryonic day E18.5, as are 1110007A13Rik and Bag3
flanking Inpp5f_v2/Inpp5f. The Fam13a gene, telomeric to
Nap1l5/Herc3, is expressed from both alleles, while Abcg2,
centromeric to Nap1l5/Herc3, is monoallelically expressed
in the placenta. This reflects a bias in expression from the 
C57BL/6 allele and the gene is therefore an expressed 
quantitative trait locus (eQTL) and not imprinted 
(Figure 2A and Supplementary Figure S2). This conclusion
is supported by the persistence of Abcg2 monoallelic
expression in Dnmt3l-/+ placental trophoblast, despite the
loss of imprinting of Nap1l5 (Figure 2B).
Figure 2. (A) A schematic map of the Abcg2 gene, with the location of the alternative promoter regions. The methylation status of the CpG island
associated with isoform 1 was examined in placenta-derived DNA. The allelic expression of Abcg2 is assessed in various fetal tissues in reciprocal
mouse crosses. (B) The allelic expression of Abcg2 and Nap1l5 in placental trophoblasts from Dnmt3l-/+ mice. (C) The allelic expression of the
ABCG2 gene in human term placenta. 
To assess the allelic origin of expression for the
orthologous flanking genes in the human, we assessed
the expression of REM1, ID1, BAG3, C10ORF119 and
FAM13A/FAM13AOS. These genes are expressed 
biallelically in all fetal tissues (Supplementary Table S3 
and Figure S3). In higher primates, including rhesus 
monkey, orangutan, chimps and humans, the organization 
of the genes centromeric to NAP1L5/HERC3 is different 
to that of mouse and rat, resulting in the gene PIGY 
being immediately centromeric to NAP1L5/HERC3, and 
ABCG2 ~300kb away. In all human fetal tissues 
analyzed, including placenta, both PIGY and ABCG2 
are biallelic (Figure 2C and Supplementary Figure S3). 
These findings strongly suggest that imprinted
retrogene-host pairs do not form part of larger imprinting 
clusters. 

Histone modification and allelic repression at imprinted 
retrogene DMRs 

In mouse the imprinted expression of Nap1l5 and
Inpp5f_v2 is restricted to the brain (17-19), whereas in 
humans, imprinted expression is observed in a wider 
variety of fetal tissues (Figure 1, Supplementary Table 
S1 and Figure S4). To date all germline DMRs are
differentially methylated in all somatic tissues, indicating
that DNA methylation on its own is not responsible
for tissue-specific differences in expression observed at
imprinted loci. To investigate whether the discrepancy 
in expression profiles we observe between species could 
be attributed to histone modifications, we analyzed the 
allelic enrichment of both permissive and repressive 
histone modifications. Our analysis focused on modifica- 
tions on histone H3 and H4, including acetylation of 
lysine-9 (H3K9ac) and H3K4me2 as markers of active 
chromatin; and the repressive marks of H3K9me3 and 
H3K27me3 of histone H3, along with the histone H4 
modifications H4K20me3 and H2A/H4R3me2s. 

ChIP was performed on native chromatin from brain 
and decapitated embryos for both BxC and BxJF1 crosses.
We ascertained allelic enrichment using polymorphisms
mapping within the DMRs of Nap1l5, Inpp5f_v2 and
Mcts2. The active modification H3K4me2 was strongly
enriched specifically on the unmethylated paternal allele
for all three DMRs in both brain and embryo, while
H3K9ac precipitation was detected predominantly in brain
(Figure 3), the tissue in which these genes are expressed.
The same regions showed precipitation of the repressive 
marks H3K9me3, H4K20me3 and H2A/H4R3me2s on 
Figure 3. (A) The allelic precipitation of the mouse retrogene DMRs in embryo and brain tissues. Native ChIP followed by PCR and restriction
digest-mediated allelic discrimination of the input, antibody bound (B) and unbound (U) chromatin fractions on BxJ embryos and brains for Nap1l5
and Inpp5f_v2, and on BxC embryos and brains for Mcts2. The asterisks represent a relative allelic enrichment of >3-fold compared to the unbound
fraction. 
the DNA methylated maternal allele (Figure 3). The repressive
H3K27me3 mark showed different profiles for
the three mouse DMRs. This modification did not show 
allelic enrichment at the Inpp5f_v2 DMR in either brain 
or embryo, but was precipitated on the DNA methylated 
maternal allele at the Mcts2 DMR. At the Nap1l5 DMR,
we observed that H3K27me3 was precipitated on the 
unmethylated paternal allele in embryo but not in brain 
(Figure 3). This pattern of enrichment is reminiscent of the 
monoallelic bivalent chromatin domain reported at the 
Grb10/GRB10 gene (21,22). This monoallelic bivalent
conformation is not detected at the Nap1l5 DMR in
brain, suggesting that the removal of H3K27me3 is concomitant
with the paternal expression observed for Nap1l5
in mouse brain. 

Extensive genotyping of the human NAP1L5, MCTS2 
and INPP5F_V2 DMRs revealed that SNPs in these regu- 
latory regions are rare. However, we were able to identify 
heterozygous samples that allowed us to discriminate 
between alleles. The SNP rs2972011 is located -200 bp 
from the TSS of NAP1L5, whereas rs7907781 and 
rs1115713 are -600 bp and -50 bp from the TSS of
INPP5F_V2 and MCTS2, respectively. To ensure that 
these SNPs mapped within the DMRs, we performed 
DNA methylation immunoprecipitation (MeDIP) using
antisera directed against 5-methylcytosine. This approach was chosen
because of the difficulty in amplifying bisulphite-converted DNA
in the vicinity of SNPs rs7907781 and rs1115713. For all
three regions we observed monoallelic enrichment in
heterozygous placental DNA samples, and where informative,
the DNA methylation was detected on the maternal
allele (Figure 4A). Using these same amplification condi- 
tions, we performed ChIP on native chromatin isolated 
from adult peripheral blood leukocytes and from two 
human placental cell lines, TCL1 and TCL2 (23) for the 
NAP1L5 and MCTS2 DMRs. Unfortunately, no hetero- 
zygous cell lines could be found that were informative for 
INPP5F_V2, despite genotyping of over 140 leukocyte
samples and normal tissue cell lines. Similar to the 
mouse, we observe strong monoallelic enrichment for 
H3K4me2 at the NAP1L5 and MCTS2 DMRs, but since 
no parental DNA samples were available, allelic origin 
could not be assigned. Unexpectedly, we did not observe 
allelic precipitation for any of the repressive histone marks 
at these DMRs, despite strong allelic enrichment at the 
SNURF/SNRPN, H19 and MEST DMRs (Figure 4B;
data not shown). To confirm that the histone modifications 
were present at the NAP1L5 and MCTS2 DMRs, we per- 
formed quantitative ChIP analysis on the placental cell 
line TCL1 (Figure 4C). The precipitation values obtained 
were normalized to those for the SNURF/SNRPN DMR, 
which revealed that H3K4me2 is more abundant at the 
MCTS2 and NAP1L5 promoters, whereas the repressive
histone modifications were precipitated at several-fold lower levels.
Interrogation of human histone maps [http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/hgtcell.aspx, and (24)]
confirmed the absence of significant enrichment for these
repressive marks at 1-2 nucleosome resolution
(the approximate size of the MCTS2 and NAP1L5 DMRs),
despite strong H3K4me2 enrichment. Together, these data 
suggest that these repressive marks are not present, or very 
low, in these two regions. 



DISCUSSION 

In this study, we have shown the paternal allele-specific
expression of INPP5F_V2, MCTS2 and NAP1L5 in a 
wide range of human tissues. This is in contrast to the 
mouse, where Inpp5f_v2 and Nap1l5 show spatial expression
restricted to brain. We show that the TSSs for these
three genes are embedded in regions of differential DNA 
methylation, similar to their mouse orthologues. 

In the mouse, Mcts2 and Nap1l5, like Inpp5f_v2, are
organized such that the imprinted 'intronic retrogene' 
and its DMR/promoter reside in the intron of another 
gene known as the 'host'. In each case the retrogene 
originated from an ancestral gene on the X chromosome
some time in early eutherian evolution, since Inpp5f_v2,
Mcts2 and Nap1l5 are absent in marsupials (19).
Retrotransposition has also been linked to the imprinting
of the RB1 gene; however, this event occurred much later
in mammalian evolution, as only humans, chimpanzees and
rhesus monkeys, but not mice and rats, have the processed
RB1/KIAA0649 pseudogene (25). A recent high-resolution
analysis of parent-of-origin allelic expression in mouse 
brain has revealed that the host genes Herc3 and Inpp5f 
show evidence for allele-specific alternative polyA choice 
similar to that of H13 [A.J. Wood and R.J. Oakey, 
unpublished data and (14,15)]. 

To assess whether the human orthologues MCTS2, 
NAP1L5 and INPP5F_V2 are also associated with imprinted
host genes, we analyzed the allelic expression of
the host genes in a wide range of fetal tissues and term 
placentas. We observe in all cases, that the host transcripts 
are biallelically expressed. The genomic locations of the
host gene exons differ between the two species. In the mouse, the polyA
signal for the paternally expressed H13d isoform maps
to within 100 bp of the Mcts2 DMR, whereas these two
features are separated by >7kb in humans. Genomic rearrangements
are more pronounced at the Nap1l5/NAP1L5
locus, where Herc3b and Nap1l5 are separated
by 1.5 kb in the mouse, and by more than 39 kb in humans.
In addition, the mouse Herc3c isoform initiates from
within the Nap1l5 DMR, whereas the human promoter
for this isoform is adjacent to the HERC3B polyA
(Figure 1D). These differences in exon distribution could
explain the lack of host gene imprinting observed in 
humans. We have previously proposed that transcription- 
al interference may be involved in alternative polyA choice 
at H13, due to the high expression of Mcts2 in brain 
directly inhibiting the transcription of H13 on the 
paternal allele in co-expressing cells. Alternative models, 
including the recruitment of methylation-sensitive polyA 
factors, or the association of the CTCF/cohesin
boundary complex with the unmethylated paternal allele of
Mcts2, could result in similar allelic termination of the
host genes on the maternal allele (14). For any of the
models, the imprinting of host gene transcripts may
require a close physical proximity of the host gene exons
with the retrogene, which is not observed in humans. 

Genes flanking the retrogene-host pairs are biallelically 
expressed 

Many imprinted genes are clustered in the genome, such 
that their expression is influenced by shared control 
elements, such as DMRs. The imprinted retrogene-host 
pairs are located outside of characterized clusters. To 
confirm this, we assayed flanking genes for expression 
status and found all to be biallelic, with the exception of 
Abcg2 which is monoallelically expressed in the placentae 
from reciprocal BxC and CxB sub-species intercrosses, but 
not subject to imprinting. 

It has recently become evident that non-imprinted 
monoallelic expression can result from SNP-associated 
DNA methylation (26). To assess whether the Abcg2 
eQTL is due to monoallelic DNA methylation similar to 
that described in humans, we examined the promoter
CpG island of Abcg2 and found it to be fully unmethylated (Figure 2 and
data not shown) suggesting that another mechanism is 
regulating the allelic expression. Recently, Brideau et al. 
(27) reported that the AK006067 transcript next to the 
imprinted Rasgrf1 gene is expressed >100-fold higher
from the C57BL/6 allele than the PWK allele. Both our
finding of the Abcg2 eQTL and the observations of
Brideau et al. (27) emphasize the necessity to study reciprocal
F1 crosses.

Histone modification signatures at imprinted retrogene 
DMRs 

Recent studies have suggested that there is a link between 
DNA and histone methylation at imprinted DMRs. A 
comprehensive analysis of allelic histone modifications in 
Dnmt3l-/+ conceptuses revealed that without oocyte-derived
DNA-methylation imprints, there is a dramatic
effect on the presence of repressive histone modifications, 
with maternally DNA methylated DMRs adopting a 
paternal epigenotype (7). We explored whether the chromatin
at Nap1l5, Mcts2 and Inpp5f_v2 is associated with
the plethora of histone modifications known to be 
enriched at DMRs. In agreement with our previous obser- 
vations, we found that the DNA-methylated alleles of 
each DMR are enriched for the repressive histone marks 
H3K9me3, H4K20me3 and H2A/H4R3me2s. Unlike 
these three modifications, H3K27me3 was not consistently 
associated with the DNA-methylated allele, which agrees 
with earlier chromatin studies on ICRs. At the Nap1l5
DMR, we confirm the previous observation of bivalent 
chromatin in mouse embryos (7), with H3K4me2 and 
H3K27me3 both enriched on the paternal allele. This 
monoallelic bivalent domain behaves in a similar fashion 
to the recently described monoallelic bivalent chromatin at 
the Grb10 DMR (21). Like their non-imprinted bivalent
counterparts, these genes are associated with 'poised'
lineage-specific transcription in mouse and human ES
cells (28). In brain, we observe absence of allelic enrichment
for H3K27me3, and this correlates with acquired
expression of Nap1l5, a mechanism comparable to
what we have reported previously at Grb10 (21,22).
These observations suggest that monoallelic bivalent
chromatin could be a common mechanism conferring 
brain-specific imprinted gene expression. 

To confirm the conservation of these histone modifica- 
tions at the regions we identified as DMRs in humans, we 
performed ChIP on leukocytes and placenta cell lines. To 
our surprise we did not find significant enrichment of the 
repressive histone marks H3K9me3 and H4K20me3 on 
the DNA-methylated allele. This uncoupled action of the 
DNA-methylation machinery and the H3K9me3 and 
H4K20me3 HMTs may be partially explained by the pro- 
gressive decrease in CpG island size and CpG density at 
the NAP1L5 and MCTS2 DMRs compared to other im- 
printed DMRs in the genome (Supplementary Table S4). 
Throughout mammalian evolution, the NAP1L5 and 
MCTS2 DMRs have lost approximately half of their 
CpG dinucleotides compared to mouse. Both DMRs 
have diminished in size, with a reduction from >420 bp
to 216 bp for MCTS2. The human NAP1L5 DMR fails to 
reach standard CpG island criteria (GC content >50%; 
Obs CpG/Exp CpG >0.6; min length 200 bp). The loss in 
CpG island size at the NAP1L5 DMR is due to a com- 
bination of CpG deamination and the integration of 
numerous CpG low-density repeat elements, including 
DNA-MER115, LINE-1 and low-complexity CT-repeats 
immediately downstream of the transcription start site. 
The human and mouse genomes have recently been 
shown to contain a similar number of CpG islands (29). 
Our observation of an evolutionary loss of CpG density at 
the NAP1L5 and MCTS2 DMRs goes against the general 
trend. Additionally, for many imprinted DMRs analyzed
in humans (KvDMR1, GRB10, MEST, ZAC1, NDN,
GNAS EX1A, GNAS XL, PEG3, PEG10, NNAT and
IGF2R/AIR), the CpG islands comprising the
DMRs are all larger in humans than in mouse
(Supplementary Table S4). The low-CpG density within 
the promoters of NAP1L5 and MCTS2 may mean that 
these discrete DMRs go unrecognized by the non-histone 
proteins including the HMTs for H4K20me3 and 
H3K9me3, respectively (8,12). Overall these observations 
are in agreement with recent studies suggesting that
functional imprints require DNA
methylation before the acquisition of repressive histone
methylation (7), and that deficiencies in repressive 
histone marks do not have a direct role in the regulation 
of DNA methylation at ICRs (12,13,30). 
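
The CpG island criteria quoted above (GC content >50%; Obs CpG/Exp CpG >0.6; min length 200 bp) can be checked for a candidate region with a short script. The sketch below is illustrative only and was not part of the analysis in this study: it evaluates one sequence as a whole rather than scanning windows, and it assumes the commonly used definition of expected CpG as (number of C x number of G) / sequence length. The function names are hypothetical.

```python
def cpg_island_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio)."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        return 0, 0.0, 0.0
    n_c = seq.count("C")
    n_g = seq.count("G")
    n_cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (n_c + n_g) / length
    # Assumed definition: expected CpG = (#C * #G) / length
    expected = n_c * n_g / length
    obs_exp = n_cpg / expected if expected > 0 else 0.0
    return length, gc_fraction, obs_exp


def meets_cpg_island_criteria(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds quoted in the text to one sequence."""
    length, gc_fraction, obs_exp = cpg_island_stats(seq)
    return length >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    # Synthetic sequences for illustration only.
    print(meets_cpg_island_criteria("CG" * 100))  # CpG-dense, 200 bp
    print(meets_cpg_island_criteria("AT" * 100))  # no C or G at all
```

Applied to a region such as the human NAP1L5 DMR, a check of this kind makes explicit which of the three thresholds is not met.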



CONCLUSIONS 

In summary, we have shown that the human orthologues 
of mouse imprinted retrogenes are paternally expressed in 
a wide range of fetal tissues. In mice, the host genes are 
subject to alternative polyadenylation, presumably as a 
consequence of retrogene integrations that acquired im- 
printing in proximity to weak polyA signals. In humans, 
we show that retrogene promoters are subject to 
allele-specific CpG methylation, but internal polyA sites 
of host genes are situated further upstream of the DMRs 
and thereby escape their influence. In mice, these DMRs 
are associated with allelic repressive histone modifications. 



At the mouse Nap1l5 promoter, monoallelic bivalent
chromatin i.e. the enrichment of both H3K4me2 and 
H3K27me3 on the same allele, is associated with the 
unmethylated paternal allele. In humans, an evolutionary 
deterioration in CpG island size correlates with a lack 
of allelic H3K9me3 and H4K20me3 precipitation at 
NAP1L5 and MCTS2 DMRs, indicating imprinted gene
expression in the absence of an ICR-specific histone
signature. 